{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "ff1ef672-bc9d-48ae-aa31-267f24026f6a",
   "metadata": {},
   "source": [
    "## 1. PP-LiteSeg模型简介\n",
    "\n",
    "语义分割作为视觉三大任务之一，在实际应用中具有广泛的需求。尽管基于深度学习的语义分割技术取得了重大进展，但是有时候语义分割模型的精度和性能无法同时满足业务需求。\n",
    "\n",
    "针对上述问题，PaddleSeg团队提出了一个新的轻量级实时语义分割模型PP-LiteSeg。具体来说，PP-LiteSeg模型中提出了轻量级解码器（FLD），以减少解码器的计算开销。为了加强特征表示，我们提出了统一注意力融合模块（UAFM），该模块利用空间和通道注意力来产生权重，然后将输入特征与权重融合。此外，我们提出了简易金字塔池化模块（SPPM），以低计算聚合全局上下文。\n",
    "\n",
    "在Cityscapes测试集上使用NVIDIA GTX 1080Ti进行实验，PP-LiteSeg的精度和速度可以达到 72.0% mIoU / 273.6 FPS 以及 77.5% mIoU / 102.6 FPS。与其他模型相比，PP-LiteSeg在精度和速度之间实现了SOTA平衡。\n",
    "\n",
    "PP-LiteSeg模型由飞桨官方出品，是PaddleSeg团队推出的SOTA模型。 更多关于PaddleSeg可以点击 [https://github.com/PaddlePaddle/PaddleSeg](https://github.com/PaddlePaddle/PaddleSeg) 进行了解。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "55360c7a-3191-40bf-99c5-64c1c3d89967",
   "metadata": {},
   "source": [
    "## 2. 模型效果及应用场景\n",
    "\n",
    "### 2.1 实时语义分割任务\n",
    "\n",
    "#### 2.1.1 数据集\n",
    "\n",
    "数据集以Cityscapes为主，分为训练集和测试集。\n",
    "\n",
    "#### 2.1.2 模型效果速览\n",
    "\n",
    "PP-LiteSeg模型在测试图片上的分割效果如下。\n",
    "\n",
    "原图：\n",
    "<div align=\"center\">\n",
    "<img src=\"https://user-images.githubusercontent.com/48357642/201077761-3ebeda52-b15d-4913-b64c-0798d1f922a5.png\"  width = \"60%\"  />\n",
    "</div>\n",
    "\n",
    "分割后的图：\n",
    "<div align=\"center\">\n",
    "<img src=\"https://user-images.githubusercontent.com/48357642/201077985-29954838-9df6-4ab4-9f91-23e9a20be513.png\"  width = \"60%\"  />\n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "be5dce27-7842-4af5-8ba7-8a1d71b31680",
   "metadata": {},
   "source": [
    "## 3. 模型如何使用\n",
    "\n",
    "### 3.1 模型推理\n",
    "\n",
    "* 安装PaddlePaddle\n",
    "\n",
    "安装PaddlePaddle，要求PaddlePaddle >= 2.2.0。由于图像分割模型计算开销大，推荐在GPU版本的PaddlePaddle下使用。\n",
    "\n",
    "在AIStudio中，大家选择可以直接选择安装好PaddlePaddle的环境。 如果需要执行安装PaddlePaddle，请参考PaddlePaddle官网。\n",
    " \n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "99436a4d-9c54-4b5e-b494-68424a09d7a5",
   "metadata": {},
   "source": [
    "\n",
    "* 下载PaddleSeg\n",
    "\n",
    "（不在Jupyter Notebook上运行时需要将\"！\"或者\"%\"去掉。）"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a16f87a6-85ea-4050-a698-ea634db9c235",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "%cd ~\n",
    "!git clone https://gitee.com/PaddlePaddle/PaddleSeg.git"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "846586a0-456d-49da-a4d3-6192f26c2e01",
   "metadata": {},
   "source": [
    "* 安装PaddleSeg"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a234d0dc-a4bd-48b2-af1b-168538b6d9b6",
   "metadata": {
    "scrolled": true,
    "tags": []
   },
   "outputs": [],
   "source": [
    "# 安装PaddleSeg\n",
    "%cd ~/PaddleSeg\n",
    "!git checkout release/2.6\n",
    "!pip install -v -e ."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "01b72385-c22e-414a-8d3e-5fcaea6e34b4",
   "metadata": {},
   "source": [
    "* 快速体验"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3b63e945-38d8-45dd-a066-3b5c05fc06a2",
   "metadata": {
    "scrolled": true,
    "tags": []
   },
   "outputs": [],
   "source": [
    "# 下载模型\n",
    "!wget https://paddleseg.bj.bcebos.com/inference/pp_liteseg_infer_models/pp_liteseg_stdc1_cityscapes_1024x512_scale1.0_160k_inference_model.zip\n",
    "!unzip pp_liteseg_stdc1_cityscapes_1024x512_scale1.0_160k_inference_model.zip\n",
    "# 下载测试图片\n",
    "!wget https://paddleseg.bj.bcebos.com/dygraph/demo/cityscapes_demo.png\n",
    "# 预测\n",
    "!python deploy/python/infer.py \\\n",
    "    --config ./pp_liteseg_stdc1_cityscapes_1024x512_scale1.0_160k_inference_model/deploy.yaml \\\n",
    "    --image_path ./cityscapes_demo.png \\\n",
    "    --save_dir output/result"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5d99268c-5dcf-47bb-97f7-c11b54ebed48",
   "metadata": {},
   "source": [
    "结果保存在`PaddleSeg/output/result/cityscapes_demo.png`（如下图）。\n",
    "\n",
    "<div align=\"center\">\n",
    "<img src=\"https://user-images.githubusercontent.com/48357642/201077985-29954838-9df6-4ab4-9f91-23e9a20be513.png\"  width = \"60%\"  />\n",
    "</div>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "40e633a3-8f48-4883-8de7-fc88b3cb7ea7",
   "metadata": {},
   "source": [
    "### 3.2 模型训练\n",
    "\n",
    "* 准备\n",
    "\n",
    "参考前文，安装PaddleSeg。参考[PaddleSeg](https://github.com/PaddlePaddle/PaddleSeg)文档准备Cityscapes数据集，整理如下。\n",
    "\n",
    "```\n",
    "PaddleSeg/data\n",
    "├── cityscapes\n",
    "│   ├── gtFine\n",
    "│   ├── infer.list\n",
    "│   ├── leftImg8bit\n",
    "│   ├── test.list\n",
    "│   ├── train.list\n",
    "│   ├── trainval.list\n",
    "│   └── val.list\n",
    "```\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4fd01e45-577f-4357-af98-26d815403ea8",
   "metadata": {},
   "source": [
    "* 训练\n",
    "\n",
    "PP-LiteSeg模型的配置文件保存在`PaddleSeg/configs/pp_liteseg/`下。使用train.py脚本，我们设置相应的配置文件并开始训练模型。\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "034f5d95-61c4-4ee0-9445-e432ba04b366",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "!export CUDA_VISIBLE_DEVICES=0,1,2,3\n",
    "!python -m paddle.distributed.launch train.py \\\n",
    "    --config configs/pp_liteseg/pp_liteseg_stdc1_cityscapes_1024x512_scale0.5_160k.yml \\\n",
    "    --save_dir output/pp_liteseg_stdc1_cityscapes_1024x512_scale0.5_160k \\\n",
    "    --save_interval 1000 \\\n",
    "    --num_workers 3 \\\n",
    "    --do_eval \\\n",
    "    --use_vdl"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f005c439-6677-439a-9e2d-9317f084d303",
   "metadata": {},
   "source": [
    "训练完成后，模型权重保存在`PaddleSeg/output/xxx/best_model/model.pdparams`中。\n",
    "\n",
    "模型训练的详细文档，可参考[模型训练](https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.6/docs/train/train.md)。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3abb9723-e839-407f-a3dc-e9ed6bca5f45",
   "metadata": {},
   "source": [
    "## 4. 模型原理\n",
    "\n",
    "PP-LiteSeg模型的结构如下图。\n",
    "\n",
    "<div align=\"center\">\n",
    "<img src=\"https://user-images.githubusercontent.com/52520497/162148786-c8b91fd1-d006-4bad-8599-556daf959a75.png\"  width = \"50%\"  />\n",
    "</div>\n",
    "\n",
    "* 提出了灵活轻量的解码器（FLD）\n",
    "\n",
    "我们提出的灵活、轻量级的解码器（FLD），在增大特征图空间尺寸的时候，逐渐减少通道数。此外，FLD的计算量可以很容易地根据编码器进行调整。灵活的设计减轻了解码器的冗余，平衡了编码器和解码器的算量，使整个模型更高效。\n",
    "\n",
    "* 提出了统一注意力融合模块（UAFM）\n",
    "\n",
    "加强特征表达是提高分割精度的关键方法，大家通常通过融合解码器中的低层细节特征和深层语义特征来实现。然而，现有方法中的融合模块通常具有较高的计算成本。我们提出了统一的注意力融合模块（UAFM）来有效地增强特征表示。在UAFM中，有两种注意力模块，即通道和空间注意力模块。UAFM模块利用通道和空间注意力来增强特征表示。\n",
    "\n",
    "* 提出了简易金字塔池化模块（SPPM）\n",
    "\n",
    "上下文聚合是提高分割精度的另一个关键，但以前的聚合模块对于实时网络来说非常耗时。我们设计了一个简易的金字塔池模块（SPPM），该模块减少了特征图的中间通道和输出通道，删除了short cut链接，并用加法操作取代了级联操作。实验结果表明，SPPM以较小的额外推理时间提高了分割精度。\n",
    "\n",
    "在Cityscapes和CamVid数据集上，我们做了大量实验来评估PP-LiteSeg模型。PP-LiteSeg模型在分割精度和推理速度之间实现了最佳权衡。具体来说，PP-LiteSeg在Cityscapes测试集上实现了72.0% mIoU / 273.6 FPS 和 77.5% mIoU / 102.6 FPS。\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b23dc65f-eff4-4cad-98c8-9856c5cb9ee1",
   "metadata": {},
   "source": [
    "## 5. 学术引用\n",
    "\n",
    "如果我们的项目在学术上帮助到你，请考虑以下引用：\n",
    "\n",
    "```\n",
    "@article{peng2022pp-liteseg,\n",
    "  title={PP-LiteSeg: A Superior Real-Time Semantic Segmentation Model},\n",
    "  author={Juncai Peng, Yi Liu, Shiyu Tang, Yuying Hao, Lutao Chu, Guowei Chen, Zewu Wu, Zeyu Chen, Zhiliang Yu, Yuning Du, Qingqing Dang,Baohua Lai, Qiwen Liu, Xiaoguang Hu, Dianhai Yu, Yanjun Ma.},\n",
    "  journal={arXiv e-prints},\n",
    "  pages={arXiv--2204},\n",
    "  year={2022}\n",
    "}\n",
    "```\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "py35-paddle1.2.0"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
